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In complex systems, crucial parameters are often subject to unpredictable changes in time. Cli¬ 
mate, biological evolution and networks provide numerous examples for such non-stationarities. 

In many cases, improved statistical models are urgently called for. In a general setting, we study 
systems of correlated quantities to which we refer as amplitudes. We are interested in the case 
of non-stationarity, i.e., seemingly random covariances. We present a general method to derive 
the distribution of the covariances from the distribution of the amplitudes. To ensure analytical 
tractability, we construct a properly deformed Wishart ensemble of random matrices. We apply our 
method to financial returns where the wealth of data allows us to carry out statistically significant 
tests. The ensemble that we find is characterized by an algebraic distribution which improves the 
understanding of large events. 
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I. INTRODUCTION 

In electroencephalography (EEG) electrical currents 
are recorded at different positions on the scalp to mea¬ 
sure the brain activity. The correlations between the time 
series of these currents strongly depend on the overall 
state of the brain. During an epileptic seizure, for exam¬ 
ple, the correlations are much stronger than in normal 
periods mm- This time dependence of the correlations 
is the kind of non-stationarity that we wish to address. 
Non-stationarities are also seen when wave packets travel 
through disordered systems. Even if the disorder is static, 
the correlations between the wave intensities measured 
at different positions versus time will change, when the 
direction or the composition of the wave packet is al¬ 
tered GHp. Finance provides another important exam¬ 
ple for this type of non-stationarity. The correlations 
between stock price time series change in time, just as 
the business relations between the firms and the traders’ 
market expectations mm- Similar non-stationarities 
exist in many complex systems, including velocity fluc¬ 
tuations in turbulent flows, heartbeat dynamics, series of 
waiting times, etc. jl2Hl5j . 

A system showing non-stationary correlations may be 
interpreted as being out of equilibrium, implying that 
some of the key tools in statistical physics are not ap¬ 
plicable. Yet, the challenges are similar to the one faced 
for equilibrium systems: Is there generic or universal be¬ 
havior? — How can we identify it? - Can we set up 
statistical models for these non-stationarities? - In the 
context of finance, we recently put forward a random ma- 
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trix approach to tackle these issues nu. We also succes- 
fully applied it in a study of credit risk and its impact on 
systemic stability m- Inspite of the conceptual differ¬ 
ences, random matrix theory [HHU formally has much 
in common with statistical mechanics. Observables are 
averaged over an ensemble; in statistical mechanics, it 
usually is the microcanonical, canonical or macrocanon- 
ical one, in random matrix models, it is an ensemble of 
those matrices which describe or characterize the system. 
In the context of the present discussion, random matrix 
models can be divided into two classes: 

1. The ensemble is fictitious. It comes into play via 
an ergodicity argument only. 

2. The ensemble really exists and can be identified in 
the system. The issue of ergodicity does not arise. 

The vast majority of random matrix models in, e.g ., 
quantum chaos falls into class 1, for a review see Ref. US- 
One is interested in the spectral statistics of one indi¬ 
vidual system. Its Hamiltonian is viewed as a random 
matrix, whose dimension is eventually sent to infinity. 
Ergodicity holds in this limit, meaning that a smooth¬ 
ing energy average of an observable over one individual 
spectrum equals the average over an ensemble of random 
matrices. A noticeable exception are random matrix ap¬ 
plications to quantum chromodynamics [20| . In lattice 
gauge theory, the quarks first propagate in frozen config¬ 
urations of the gauge fields, before an average over the 
gauge fields, modeled by random matrices, is carried out 
as second step. This clearly belongs in class 2. The fluc¬ 
tuating gauge fields truly exist, the partition function 
involves an integral over them. Ergodicity reasoning is 
not evoked. 

There are numerous applications of random matrix 
theory in finance HIM] which address statistical prop¬ 
erties of correlation matrices. Many of them also deal 
with non-Gaussian ensembles. To the best of our knowl¬ 
edge, all of these applications fall into class 1, because 
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one is interested in the statistics of one individual corre¬ 
lation matrix, measured at one particular instant in time. 
In our study m, we put forward a first application of 
random matrices in finance that belongs in class 2. Non- 
stationarity makes the covariances fluctuate and thereby 
creates an ensemble of covariance matrices which we ap¬ 
proximated by a Gaussian Wishart ensemble of random 
matrices [32]. We derived how the multivariate distribu¬ 
tion of dimensionless price changes, referred to as returns, 
acquires heavy tails due to the non-stationarity. Hence, 
we showed that the non-stationarities indeed have uni¬ 
versal features. 

Here, we have three goals: First, we present in Sec. [II] 
a statistically significant way to construct a proper and 
analytically tractable random matrix ensemble from the 
data. We emphasize that this is an important issue for 
random matrix models in the context of correlations. In 
contrast to quantum chaos, where universality holds on 
the scale of the mean level spacing, there is not such a 
local scale when studying statistical properties of correla¬ 
tion matrices. Thus, a Gaussian assumption is not always 
justified and it does matter what the ensemble looks like 
in reality. In particular, realistic ensembles considerably 
help to understand and model large events. Our con¬ 
struction is general and not tied to any specific system. 
Its merit lies in the fact that once the enemble is known, 
it can be used to work out generic statistical properties of 
any observable depending on the correlation matrices, see 
Ref. [T7| for an example. Second, we apply our approach 
to financial data in Sec. Ill We identify an algebraic en¬ 
semble, which is quite relevant for risk estimation. Third, 
we discuss two issues arising in our general construction 
in Secs. IV and [V] namely a certain conceptual caveat 
and yet a further extension, respectively. Conclusions 
are given in Sec. m 


II. CONSTRUCTING A PROPER RANDOM 
MATRIX ENSEMBLE 


After setting up the general problem in Sec. |II A[ we 
introduce the deformed Wishart ensemble and derive the 
correponding amplitude distribution in Sec. |IIB| The de¬ 
termination of the deformation functions which charac¬ 
terize the ensemble and the amplitude distribution is dis¬ 
cussed in Sec. |II C| Here, we derive the approach for the 
general case, for sake of illustration, the reader is referred 


to Ref. ITB and Sec. Ill 


A. Non—Stationary Covariances 


dimensionless price changes for K stocks. Importantly, 
we assume that there are correlations between the time 
series. In complex systems, one often encounters the sit¬ 
uation that crucial system parameters, in particular the 
covariances or correlations, are seemingly random func¬ 
tions of time [331135] , To be more precise, we consider a 
time window of length T that is much shorter than the 
total interval, T <C T to t. We now want to average over 
the subinterval [t — T + l,t\ of length T whose position 
in the total interval is determined by the time t. Sample 
averages of a function /(f) in this subinterval are then 
written as 

(Rk) T (t) = ^ /(O- (1) 

t'—t-T+l 

We are particularly interested in the covariances 

£ ki(t) = ( RkRi)r(t) - ( Rk)T{t){Ri)T(t ) 

= ( r k n) T {t ) , (2) 

where we introduced the amplitudes normalized to zero 
mean value, 

r k (t) = R k (t) — (Rk)r(t) . (3) 

We keep in mind that the resulting K x K covariance 
matrix £ (t) is calculated from time series of length T. 
We now move this time window of length T through 
the data, the resulting covariances £*,;(£) fluctuate. This 
non—stationarity has an important impact on other sta¬ 
tistical observables. In the present study, we focus on the 
distribution of the amplitudes. We now consider a time 
interval T as short as possible such that the covariance 
matrix E s in this time interval is in good approximation 
constant. We begin with addressing the case in which 
the distribution of the amplitudes is, for a given time t, 
well approximated by a multivariate Gaussian 

9(r|s - )= yd et (2^) exp (~r ts;lr ) (4) 

with the K component vector r = (?r,... ,r k ) and the 
K x K covariance matrix £ s . We suppress the argu¬ 
ment t of r and use f to indicate the transpose. We refer 
to «/(rj£ s ) as static amplitude distribution. Due to the 
correlations, a Gaussian assumption for the static dis¬ 
tribution is not as restrictive as it may seem. In the 
eigenbasis of £ s , the amplitudes only appear in linear 
combinations. Thus, for large K, the mechanisms that 
lead to the central limit theorem start working and drive 
the distributions towards Gaussians. Later on in Sec. El 
we will nevertheless relax the Gaussian assumption for 
the static amplitude distribution and look at more gen¬ 
eral functional forms. 


Suppose we have measured in a system with random¬ 
ness K amplitudes as time series R k (t),k = 1,... ,K over 
a long interval T to t of time t = l,...,T to t. For exam¬ 
ples, these amplitudes can be electric or magnetic fields 
at K different points in a disordered system, positions 
of K randomly moving particles or financial returns, i.e. 


B. Deformed Wishart Ensemble and its Amplitude 
Distribution 

How does the non-stationarity affect the amplitude 
distribution when data from the total interval T tot are 
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analyzed? — As in Ref. m, we model this by random 
matrices. As the covariance matrix is different at each 
time t where it is analyzed, we replace the covariance 
matrix in the distribution 0 by the expression 

£ s —> , (5) 

where A is a real rectangular K xN random matrix with¬ 
out any symmetries. The right hand side of Eq. 0 has 
to have the form given to ensure that it can model a 
properly defined covariance matrix. This follows directly 
from the definition ([2]). Although K, the first dimension 
is fixed, the second one, N, is for the time being a free 
model parameter. It can be viewed as the length of the 
model time series. Further clarifications will follow. To 
obtain the amplitude distribution for the total interval, 
we average over the random matrices 


( 9 )(r\E,N) 


d[A]w(A\E,N)g 



( 6 ) 


where d[A\ is the volume element, be., the product of all 
independent variables in A. Following Wishart [32J |36|, 
the Gaussian distribution 

u>(A|£) =- -rrr,- -exp [ — - tr Al£ _1 A j (7) 

V ' ’ clet^ 2 (2 tt£) V 2 / 

was assumed for the random matrices in Ref. |16| . It de¬ 
scribes the Gaussian fluctuations of the model covariance 
matrices AA^/TV about the given empirical covariance 
matrix £, which is evaluated over the total time interval 
of length T to t ■ It should not be confused with the empir¬ 
ical covariance matrix £(i) calculated in the subintervals 
[t — T+ 1, t\. The crucial difference compared to Ref. [T6] 
is a generalization of the ensemble ([7]). We introduce the 
deformed Wishart ensemble 


w(A\E,N) 


OO 

J d,T)f(rj)w ^A 

o 



( 8 ) 


which is defined by the ensemble deformation function 
f(rf) with the properties 


OO 

J f(v)drl = 1 and f(rj) > 0 . (9) 

0 

For later convenience, £ on the right hand side of Eq. ([8| 
is rescaled with N. The flutuations of the model covari¬ 
ance matrices AA ^/N deviate from Gaussian, but always 
about the empirical covariance matrix £. The meaning of 
the model parameter N now becomes clearer. It sets the 
variance for these fluctuations. The above rescaling only 
changes the functional dependencies, but not the role of 
N. We emphasize once more that £ is evaluated over the 
total time interval. Similar deformations of random ma¬ 
trix ensembles but in a Hamiltonian, not Wishart setting 
were apparently first put forward in Refs. 1271 ESI- 


After inserting the ansatz ([8]) into Eq. ([6]), we may use 
the result m 

OO 

J w(A\Y,)g(r -^AA t ^ d[A\ = J x%(z)9 ( r dz ’ 

0 

( 10 ) 

which reformulates the whole random matrix average as 
a univariate average over the \ 2 distribution 

^W= 2W r(jv/2) ^ /Mexp (~l) (11) 

of N degrees of freedom. On the mathematical side, there 
are connections between formula (101 and the calculation 


of certain distributions in scattering theory (391 [40] . Us¬ 
ing the result (10), the amplitude distribution reduces to 
the double integral 


OO OO 

(ff)( r |£, N) = J drjf (g) J dz\ 2 N {z)g (V 


-£ ■ ( 12 ) 
V 


Again, we point out the rescaling of £ with N, cf. 
Eq. (10). It is useful to rewrite that as a single integral 


(g)(r\E,N)= / p{x)g (r|x£) dx 


(13) 


by introducing the variable x = z/rj and its distribution 


OO OO 

p(x) = J dg f{r]) J dzx 2 N {z)5 (x - . (14) 

o o 

We refer to it as amplitude distribution deformation func¬ 
tion. One easily obtains 

OO 

P{X) = 2”/*T(N/2) J dr lfWv N/2 ex p (~f) , (15) 


which establishes the relation between the two deforma¬ 
tion functions. 

We notice that the ansatz ([8]) restricts the form of 
the deformed distribution uJ(A[£, N) to functions of 
trAf£ -1 A only. Even though the inclusion of further 
terms such as tr(Al£ _1 A) 2 is likely to improve the qual¬ 
ity of the data fits, we stick to the ansatz (|8]). Its consid¬ 
erable advantage is the guaranteed but otherwise ques¬ 
tionable analytical tractability as will be shown in the 
sequel. Moreover, further terms will also increase the 
number of deformation functions which will hamper their 
unambiguous determination. 


C. Determination of the Deformation Functions 

Apart from the deformation functions, the distribu¬ 
tions u;(A|./V£) and (g)(r |£,iV) depend on the usual co- 
variance matrix £ analyzed by sampling over the total 
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interval. We notice that the corresponding covariance 
matrix in the deformed ensemble slightly differs from 
that. By definition we have 


meaningful results, it is out of question to compare the 
matrices directly, i.e. their individual matrix elements. 
A better observable is the distribution 


E (d ) = <^AAt) 

= J ^AA^w(A\E,N)d[A] . (16) 

Inserting Eq. ([ 8 ]), we can do the ensemble average in the 
Gaussian case which yields the covariance matrix NE/g. 
Thus, only the g integral remains and we have 

OO 

E (d) = 7VE f — dg = NErr 1 (17) 

J V 
0 

implying that the two covariance matrices differ by the 
average of I/ 77 . Alternatively, one can calculate E^ from 
the amplitude distribution, 

E( d ) = ( rr t) = J rr^ (g)(r\E,N)d[r] , (18) 

which yields 


OO 

E^) = Y, J xp{x)dx = "Ex 


(19) 


q(s) = j S (s- — tr AA^j w(A\E, N)d[A) (22) 

of the traces, which can easily be written as a single in¬ 
tegral involving the ensemble deformation function f(g). 
The distribution (22) is empirically obtained by moving 


a time window through the amplitude time series and 
calculating the empirical covariance matrices and their 
traces. This then gives f(g). 

The problem with the above procedure is its still lim¬ 
ited statistical significance. Instead, extracting the am¬ 
plitude distribution deformation function p{x) from the 
data gives much more meaningful results. As we discuss 
in the sequel, the number of data points is larger by a fac¬ 
tor of K. The amplitudes i'k appear in Eq. (13) only via 
the bilinear form r'Er. We rotate the amplitude vector r 
into the eigenbasis of the empirically obtained covariance 
matrix E. By definition, the eigenvalues of E are positive 
and larger than zero, provided the length of the sampling 
interval is larger than I\. We divide each component of 
the rotated amplitude vector by the square root of the 
corresponding eigenvalue and denote the resulting vector 
by f. Within our model, all components of r should be 
equally distributed. We integrate out all but one, fk, and 
arrive at the distribution 


Here, the two covariance matrices differ by the first mo¬ 
ment of x. With the help of Eq. (14), the results ( 17|19 ) 
are easily seen to coincide. 

Having extracted the covariance matrix for the total 
time interval from the data, we can proceed with the 
determination of the deformation functions. The expo¬ 
nential function in the integrand of Eq. (151 allow us to 
interpret it as a Laplace transform, 


r( " /2) . (20) 

where we introduced g = ??/2 to avoid inconvenient fac¬ 
tors of two. Thus, the ensemble deformation function is 
the inverse Laplace transform 


(9)(fk) = J ex P (~^) dx ■ ( 23 ) 

0 

Thus, p(x) may be identified with the distribution of the 
variances x of the Gaussian distributed random variables 
fk- Conceptually, this is our main result. It provides a 
simple and statistically significant method to obtain the 
amplitude distribution deformation function p(x) which 
then yields upon inverse Laplace transform the ensemble 
distribution function fig). As we have I\ time series 
r-fc(t), we gain a factor of K by aggregation. 


III. APPLICATION TO FINANCIAL DATA 


/( 2fj) 


T(N/2) 1 x ( p(x) \ 

2 ip/* yx 1 */ 2 - 1 ) ' 


( 21 ) 


of the amplitude distribution deformation function di¬ 
vided by a power of x. This makes it possible to deter¬ 
mine f(j]) by extracting p(x) from the amplitude time 
series and carrying out the inverse Laplace transform. In 
contrast, extracting f(g) directly from the data is cum¬ 
bersome and burdened by limited statistics, as the fol¬ 
lowing discussion shows. The rows of A are the model 
time series of length N and cannot easily be identified 
with the amplitude time series r/-(f) of lenght T. How¬ 
ever, the matrices AA' /N form the ensemble of model 
covariance matrices and can be compared with the em¬ 
pirical ones. As a certain sample length is required for 


We now apply our method to stock market data. This 
is of particular interest as heavy tails are ubiquituous in 
finance. A better modeling for multivariate distributions 
is urgently called for to improve risk estimation. We 
present the data in Sec. Ill A extract the deformation 
functions in Sec. IIII Bl and calculate the ensemble and 
return distributions in Sec. IIII Cl 


A. Data Set 

We analyze the K = 306 continuously traded stocks 
with prices Sk{t), k = in the S&P 500® in¬ 

dex between 1992 and 2012 0T], which we previously 
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analyzed with a purely Gaussian, i.e., non-deformed 
Wishart ensemble [T6]. The amplitudes are here the di¬ 
mensionless price changes 


Rk(t) 


Sk(t + At) - Sk(t) 


(24) 


which are referred to as returns. They depend on the cho¬ 
sen return horizon At. According to Eq. ([3|, we calcu¬ 
late the returns (t) normalized to zero mean. To make 
our presentation self-contained, we show once more how 
strongly the whole K x I\ correlation matrix C for this 
data set changes in time. In Fig. [T] it is displayed for 
subsequent three-months time windows. In most ran- 




FIG. 1: Correlation matrices of K = 306 companies in 
S&P 500® for the fourth quarter of 2005 and the first 
quarter of 2006. The darker, the stronger the 
correlation. The companies are sorted according to 
industrial sectors. Taken from Ref. m- 

dom matrix approaches, the ensemble is fictitious and 
enters only by means of an ergodicity argument. This 
is not so here, as Fig. [l] illustrates. Our ensemble ex¬ 
ists in reality, it is the whole set of matrices analyzed 
by moving a sample time window through the data. In 
Fig- 0 one also sees rather stable stripes in these corre¬ 
lation matrices which are due to the different industrial 
sectors, see, e.g ., Ref. m- Obviously, basis invariance 
is not present in this data set, and probably neither in 
any other real data set. Hence, a direct extraction of 
the ensemble deformation function f(j]) which preserves 
the basis invariance of the random matrix ensemble is 
problematic. Yet, there is still another reason: market 
states were identified which reveal a fine structure of the 
ensemble P0. As every random matrix ensemble has 
an effective character, one is advised to analyze quanti¬ 
ties which already reflect this. In the present case, such 
quantities are the amplitude, in the present case return, 
distribution and the corresponding deformation function 
p(x). 


B. Deformation Functions 

We use daily data, i.e., At = 1 trading day. Rotation 
of the return vector r into the eigenbasis of the empirical 


covariance matrix E, normalization to the square roots 
of the eigenvalues and aggregation on a five-day window 
yield the empirical distributions of variances shown in 
Fig- i Aggregation on a ten-day window gives similar 



x 



x 


FIG. 2: Aggregated distribution of variances as dots, 
calculated on a five-day time window on a linear (top) 
and logaritmic (bottom) scale. Fit to a beta prime 
distribution as solid line. For comparison, a y 2 
distribution as dashed dotted line. 


results; hence, the estimation noise does not have a ma¬ 
jor impact on the distribution. A variety of functions is 
capable of describing the data. In finance, one often em¬ 
ploys the log-normal distributions to model volatilities, 
see e.g. Ref. [33]. In finance, the standard deviations 
are referred to as volatilities. However, the log-normal 
distribution fails to capture the empirically found tail be¬ 
havior. More suitable is the beta prime distribution 


p(x\N,L) 


T(N + L/2) ^.iv/2- 1 

r(iV/2)r((A + L)/ 2) (1 + X ) N + L / 2 


(25) 


with two positive parameters N and L. Anticipating the 
following discussion, we choose their combination in the 
expression ( [25] ) in such a way that N coincides with the 
parameter N introduced in Sec. [H] The fit is depicted 
in Fig. [2] the agreement with the data is much better 
than for a % 2 distribution corresponding to the ensemble 
of Ref. [16. which is formally obtained by setting f{rj) = 
S(ri — 1) or f(rj) = S(ri — A), respectively, depending 
on the rescaling with N. We carry out fits for different 
return horizons At. The results for N and L are shown in 











6 


30 r 
25 : 

_j 20 : 

T3 

S 15 : 

2 io: 

5 : 


• i 


■ i 


10 

At 


15 


20 


FIG. 3: Fitted values of N (upper, ascending dots) and 
L (lower dots) for the beta prime distribution versus 
the return horizon At. Triangles and diamonds for the 
fits with constraint to integer values, suares and circles 
without constraint. 


Fig-! While L stays constant around two, N increases 
from about seven for daily data to about 23 for At = 19 
trading days. We postpone the interpretation up to the 
evaluation of the ensemble and return distribution. 

Having extracted the amplitude, i.e., return distribu¬ 
tion deformation function p(x\N, L ), we calculate the in¬ 
verse Laplace transform ©, 

f(2fj\N L) = _ r(IV + L/2) _ x / \—n—l/2\ 

nV 1 ’ J 2fj N / 2 T({N + L)/2) j 

( 26 ) 

and find [44] with r\ = 2fj for the ensemble deformation 
function 

r j(N+L)/2—l . jy. 

f(v \ N , L ) - 2 ( N +L)/ 2 t((N + L)/2) exp V 2/ 

= Xn+lW ■ ( 27 ) 

This is a Xn+l distribution with N + L degrees of free¬ 
dom. As required, f(rj\N, L) is a positive and normalized 
function. 


C. Deformed Ensemble and Return Distribution 


Inserting Eq. \27\ into Eq. (|8j) yields after a straight- 
T((N + NK + L)/2) 


forward calculation 
w{A\E,N,L) = 


T((A + L)/2)det JV/2 (7r7VE) 

trA^A \~ {N+NK+L)/2 


1 + 


N 


(28) 


for the distribution of the random matrices A. Thus, we 
arrive at an ensemble characterized by an algebraic dis¬ 
tribution. For a similar ensemble, but in the special case 
of E = lx, spectral correlation functions were studied 


in Refs. [451 146] . Here, however, we derived our ensem¬ 
ble from data, and the dependence on a non trivial E is 


in the present essential. Anticipating the result (28), we 


rescaled E with N as compared to Ref. (TBJ. Thereby, N 
and L appear on equal footing in the formulae. To ob¬ 
tain the ensemble averaged return distribution, we plug 
Eq. (25) into Eq. ( fi~3| and find 


(g)(r\E,N,L) = 


T(N + L/2)T((N + K + L)/2) 


U 


T(N/2)T((N + L)/2) 1 /det( 7 r7VE) 
N + K + L K-N + 2 r t E" 1 r\ 

(zy) 


with the confluent hypergeometric function m 

oo 

U(a,b,z) = —[ y a ^ 1 {l + y) b ~ a ~ 1 exp(-yz)dy (30) 

r(a) J 


for positive real parts of a and z. From Eq. ( [T7| or (19) 
the covariance matrix 


E (d) = 


N 


N + L-2 


(31) 


for the deformed ensemble follows. To compare with 
the empirical return distribution, we compute the inte¬ 


gral (23) 


(9)(fk\N,L) = 


T(N + L/2)r((N + L + l)/2) 


T(N/2)T((N + L)/2)v / 27r 

U\ - T ' ■ (32) 


A comment on the permissible values for the parameter 


N is in order. In the return distributions (29) and (321, 


N can take any positive real value. In the ensemble dis¬ 
tribution (28), however, N is the length of the model 


time series or, equivalently, one of the dimensions of the 
K x N matrices A and thus restricted to integer values. 
It is thus a matter of interpretation whether one wants 
to impose the constraint that N be integer. There is no 
such restriction for the parameter L. In any case, we also 
carried out fits with the integer constraint. The results 
shown in Fig. [3] do not indicate a strong influence of this 
constraint. 

The results of the data comparison are displayed in 
a for daily returns, At = 1 trading day. The fitted 
parameter values are N = 8.13 and L = 2.24. The center 
of the empirical distribution is slightly better described 
by employing the deformed Wishart ensemble instead of 
the non-deformed one in Ref. iE] . The heavy tails clearly 
reveal that the deformed Wishart ensemble yields overall 
a better description, since the result of Ref. m con¬ 
sistently underestimates the large events. In Fig. a we 
present the same analysis for returns with At = 20 trad¬ 
ing days, the fit gives N = 20.98 and L = 2.07. Here, 
the tails are still strong, but less pronounced than for 
daily data. For the interpretation of these results, we 
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FIG. 4: Aggregated distribution of daily returns, 

At = 1 trading day. Empirical results as dots, fit to the 
distribution p2| as solid line. The corresponding result 
of Ref. [15] as dashed line. Center of the distribution on 
a linear scale (top), whole distribution on a logarithmic 
scale (bottom). 


perposition of the amplitudes, in the present case the 
returns, drives the multivariate distribution towards a 
Gaussian, provided that the covariances are sufficiently 
constant. Second, as observed in Ref. nn and extended 
here, the fluctuations of the non-stationary covariances 
lift the tails of the distributions evaluated over long time 
intervals and make them heavier. Not surprisingly, the 
heavier the tails of the univariate distributions, the heav¬ 
ier are also those of the ensemble averaged multivariate 
ones shown above. This is nicely reflected in the nearly 
linear increase of the parameter N on the return hori¬ 
zon At, see Fig. [3] The smaller N, the heavier are the 
tails in the ensemble distributions (28) and in the enesm- 
ble averaged return distribution Q29|) . As N grows and 
L is held fixed, the distribution ( |28[ ) comes closer to a 
Gaussian, i.e., to the non-deformed ensemble. It is quite 
remarkable that the fit of L always yields values close to 
two. According to Eq. (31), this implies ss £. Put 
differently, the heavy tails in the cases considered alter 
the measured covariances only slightly as compared to 
a Gaussian assumption. When looking at financial risk, 
however, the tails are very important. 

Finally, we use the opportunity to discuss an issue of 
general interest when presenting and fitting a multivari¬ 
ate distribution that depends on the statistical variables 
only via a bilinear form such as rf£ _1 r. Instead of the 
above procedure which involves rotation of r into the 
eigenbasis of £ and aggregation, one might also view the 
bilinear form as a generalized radius 


P 


= Vrt£-! 


(33) 


in the K dimensional space and study its distribution 



h 


FIG. 5: Same as Fig. [4] (bottom) for returns with 
At = 20 trading days. 


recall the well-established fact that univariate distribu¬ 
tions of returns for one stock acquire heavy tails as the 
return horizon At becomes smaller, see e.g. Ref. TFl . 
Here, however, we analyze the multivariate distribution 
of K correlated stocks. Thus, there are two competing 
effects. First, as discussed in general in connection with 
Eq. © and for the financial data in Ref. nn, the su- 


(3rad)(p) = J 5 (p- VrtE-ir) (g) (r|£, N, L)d[r) . 


A rather straightforward calculation yields 


(34) 


T(N + L/2)T((N + K + L)/2) 

\i/rad/ \P) 2 k / 2 - 1 T{N/2)T{K/2)T((N + L)/2) 

P K -^U(^±±4) . (35) 

The Jacobian p A_1 , typical for such a radial distribu¬ 
tion, appears and, because of K = 306, dominates the 
functional form of the distribution for small p, as can 
be seen in Fig. [6] The theoretical result (35) describes 
the data much better than the corresponding result of 
Ref. [16]. Nevertheless, the dominance of p A_1 gives a 
somewhat misleading picture and we infer that using the 
distributions (29) and (32) is more appropriate. 


IV. PERMISSIBILITY OF DEFORMATION 
FUNCTIONS 

When extracting the return distribution deformation 
function p{x) from the data, we encountered a puzzling 




















p 



FIG. 6: Radial distribution of daily returns, At = 1 
trading day. Empirical results as dots, fit to the 
distribution (|35| as solid line. The corresponding result 
of Ref. m as dashed line, on a linear (top) and a 
logarithmic scale (bottom). 


must be positive definite. We test this by calculating the 
distribution of the traces 


u(s) 





w(A\N,c)d[A] . (38) 


As we only wish to test the positivity, it is convenient 
to choose it different from the above distribution (22) by 
including the covariance matrix. After some algebra, we 
can express it as a high-order derivative involving the 
return distribution deformation function, 


u(s) 


(_!)(*—iW2 r( jy /2) 

T(KN/2) 

j(A-l)AT/2 

ds (K-l)N/2 gN/2-1 ' 


This in principle general result gives for the log-logistic 
distribution ( [36] ) 


u 


(_l)(A--iW 2jVc Ar /2 T{n/2) 2 _ x 

2T(KN/2) 

d (K-l)N/2 1 

d S ( K ~ 1)^/2 ( c iV/2 + s 7V/2)2 


• (40) 


Restricting ourselves to even N , we may employ the the¬ 
ory of complex functions to calculate the pole expansion 


1 

(c N / 2 + s N / 2 ) 2 


E 


n—1 


l 


rdm^n 


( a n 


a m ) 2 


problem that we wish to report here. The log-logistic 
distribution 


p(x\b, c) 


b (a :/c) b 1 

c (1 + (x/c) b ) 2 


(36) 


1 


(s - a n ) 2 




(41) 


with the poles 


with b = N/2 yields a very good description of the data, 
even slightly better than the beta prime distribution. 
The c values are around one, N = 4 for At = 1 trading 
day and increasing for larger At. However, the result¬ 
ing ensemble deformation function can take positive and 
negative values. For example, for N = 4, we find 


f(v\c) 


s'm(r]/2) — r]cos(r]/2)/2 


(37) 


Hence, it cannot be interpreted as a distribution. In¬ 
deed, we are confronted with a problem of interpretation. 
While the discussion in Sec. |II C| clearly revealed that 
p(x) is a well-defined distribution of a random variable, 
namely of a variance, it is not obvious that f(rf) also rep¬ 
resents a well-defined distribution. The corresponding 
random variables, the matrices A, do not have a direct 
data interpretation. Thus, one might simply view f{rf) as 
a continuous coefficient function for the expansion of the 
distribution ([8]) in terms of Gaussians. Nevertheless, even 
if one does not want to enforce an interpretation of f(if) 
as a distribution, the resulting ensemble distribution ([8]) 


, i.2-n 

a n = cexp ( -^~A n + 1) 


(42) 


The derivatives in Eq. (40) can now easily be evaluated 
and we arrive at 


= Nc n / 2 T(N/2) kn / 2 -i 
1 ’ 2T(KN/2) 


E 2 1 ( F((K - l)N/2 + 2) 

h IW„K - a m ) 2 {(a a n )( K ~ 1 W 2+2 

2T((K-l)N/2 + l) v 1 \ 

( s _ an )(A-i)iV/2 + i 2^ an _ ai j A*) 


Inspite of the complex poles, this is by construction a 
real function. Yet, it takes positive and negative values 
which outrules an interpretation of u(s) and thus also of 
w(A\N, c) as distributions. By means of this example we 
face the somewhat surprising result that a well-defined 
distribution p(x) does not necessarily yield a well-defined 
ensemble. Each case has to be investigated individually. 
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V. FURTHER EXTENSION BY DEFORMING 
THE STATIC AMPLITUDE DISTRIBUTION 


We argued in Sec. [TTa] that the Gaussian assump¬ 
tion Q for the static amplitude distribution is not as 
restrictive as it might appear at first sight. Nevertheless, 
we now extend our construction by assuming more gen¬ 
eral functional forms. At present, we do not have data 
at our disposal in which the static amplitude distribu¬ 
tion is non-Gaussian, but we nevertheless now extend 
our construction, as it might be useful for future data 
analyses. Moreover, we will also come across some inter¬ 
esting observations. Instead of Eq. Q, we now assume 
that the static amplitude distribution can be expressed 
as an average over the Gaussian Q, 


OO 

5 (r|E s ) = J h{£)g 


j K 


(44) 


with a new deformation function /i(£) that fulfils 

OO 

J h(Z)d£ = 1 and > 0 . (45) 


We proceed as in Sec. m Instead of the ensemble aver¬ 
age (k 3|), we now have 


( 3 )(r|S,iV)= / d[A]w(A\X,N)g(r 


This is readily cast into the form 


— AA^ 
N 


(46) 


(g)(r |£, N) = J p{x)g (r|a;£) dx , (47) 

o 


which differs from Eq. (13) only by the definition of the 


amplitude distribution deformation function. It is now 
given by 

OO OO OO 

p(x) = J d£h{Z) J di)f{r)) J dzx 2 N (z)5 (z - ^) . 

0 0 0 

(48) 

For fixed £, we introduce the new variable f/ = r/£ and 
find 


This integral is reminiscent of a convolution. Thus, the 
case of a deformed, non-Gaussian static amplitude dis¬ 
tribution is formally traced back to the Gaussian case. 
The difference can be fully absorbed into the ensemble 
deformation function. Importantly, this means that all 
other results of Sec. [TT| continue to hold, in particular the 
Laplace transform ( |20[ ) and its inversion ©. Never¬ 
theless, the following problem remains. We can extract 
h(£) and p(x) from the data by using the methods out¬ 
lined in Sec. |IICl for very short time intervals and for the 
whole, long time i nterv al, respectively. From the inverse 
Laplace transform (21), we obtain /(?)), but to determine 
f(r]), we are left with the task to invert Eq. (50). Al¬ 
though that is definitely possible for some special cases, 
a general inversion formula is lacking. In practical appli¬ 
cations, however, the extension sketched above is more 
likely to be needed for consistency tests. For example, 
if some of the available data for the same system permit 
the Gaussian assumption for the static amplitude distri¬ 
bution and others do not, one can first determine f(rj) 
as described in Sec. |n] and then turn to the data which 
require an additional deformation function /i(£). Once 
both of these deformation functions are known, one can 
evaluate f(fj) and check if it is consistent with the in¬ 


verse (21) of p(x) which is independently extracted from 


the data. 

As an example, we consider the case that both, /(?y) 
and h(£), are y 2 distributions 

f(v) = Xn+l(v ) and MO = Xm( 0 (5 1 ) 

of N + L and M degrees of freedom, respectively. The 
choice for /(ry) coincides with the result of Sec. 

With Eq. ([50l> , we obtain 


IIIB 


f(v) = 


Vf> 


(JV+L+M)/2-2 


fc(N+L-M )/2 (VO 


2 (iv+£+M)/ 2 -i r ((jV + L)/2)T(M/2) 


(52) 


where K, v is the modified Bessel function of the second 
kind of order v. This function already appeared in the 
ensemble averaged return distribution of Ref. |16| . Ac¬ 
cording to Eq. (49), the distribution p(x) is an integral 


involving the modified Bessel function and the return dis¬ 
tribution averaged over the deformed ensemble is an in¬ 
tegral over a product of modified Bessel functions, but 
we do not give the formulae here. 


VI. CONCLUSIONS 


OO OO 

p(x) = J drjf (fj) J dzx 2 N {z)5 (x - . (49) 

o o 


which coincides with Eq. (141, but now involving the new 
ensemble deformation function 


m = J m ■ (so) 


Non-stationarity is an often encountered feature in 
complex systems. Here, we addressed non-stationarity 
of correlations. We presented a method to determine 
their distribution from the amplitude distribution. Put 
differently, we showed how to extract the proper ensem¬ 
ble of random covariance matrices from amplitude data. 
Carrying out our analysis for the case of financial data, 
we found an algebraic distribution of covariance matri¬ 
ces reflecting the heavy tails in the amplitude, i.e., return 
distributions. 
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A conceptually important comment is in order. Con¬ 
sider any two empirical covariance matrices £(fi) and 
£(£ 2 )- They are certainly not independent, because, first, 
the correlation structure due to, e.g., the industrial sec¬ 
tors in the financial markets only changes on a large time 
scale and, second, one would expect that they are the 
more dependent the smaller the time difference \t\ — < 21 - 
The first cause is a built-in feature of our model, as the 
random model covariance matrices fluctuate around the 
empirical £. The second cause is effectively accounted for 
as the length of our model time series N, i.e., the second 
dimension of the random matrices A , is different from 
the length T of the subintervals in which the empirical 
covariance matrices £(t) are evaluated. The values for N 
resulting from the data analysis are very small N <C K 
while T > K when measuring £(£). Our random ma¬ 
trix ansatz does not aim at modeling the ensemble of 
the empirical covariance matrices £(t) in a one-to-one 
fashion. This is never the goal of a statistical approach. 
Our model has an effective character which is obvious, 
for example, in the relation N <C T. Model time series 
much shorter than the empirical ones suffice to properly 
grasp the statistical effects induced by the fluctuations 
around the average covariance matrix. This also reflects 
the mutual dependence of the empirical covariance ma¬ 
trices. We demonstrated by our analysis of financial data 
how useful our model is. 

Furthermore, observables such as the amplitude dis¬ 
tribution do not resolve autocorrelations in time of any 
kind. Even reshuffeling the order of the empirical covari¬ 
ance matrices £(£) in time does not change the amplitude 
distribution for the total time interval. A similar situa¬ 
tion is encountered when analyzing statistical properties 
of quantum chromodynamics. The gauge fields may be 
viewed as a really existing ensemble over which an av¬ 
erage is carried out — this is the very definition of the 
partition function. Although these gauge fields are not 
independent either, an effect of this autocorrelation is 
only seen if corresponding observables are used. Den¬ 
sities are not affected. Random matrices as models for 
the highly non-trivial gauge fields are applied with great 
success. 

Yet another aspect is worth mentioning. In contrast 
to random matrix applications for Hamiltonian systems, 


there is not a local scale that can enforce universal sta¬ 
tistical behavior. Thus, it is important how the covari¬ 
ances are actually distributed. Gaussian assumptions are 
only acceptable if really justified by data analysis. The 
present study extends a previous one in which we em¬ 
ployed such a Gaussian assumption in finance. Here, we 
reconsidered the same data set and clearly demonstrated 
that the Gaussian assumption underestimates the tails. 
The algebraic distributions that we found here are rele¬ 
vant for risk estimation as they help to better understand 
large events. Importantly, once the ensemble is properly 
extracted, meaningful averages can be computed for all 
observables that depend on non—stationary covariances. 

When developing our construction, we came across a 
puzzling feature which calls for a caveat. The deforma¬ 
tion function extracted from the amplitude distribution 
determines, on the one hand, uniquely the ensemble, but, 
on the other hand, this ensemble is not necessarily well- 
defined. Each case has to be studied individually. We 
do not expect this to cause severe problems in applica¬ 
tions, but conceptually it is an interesting aspect. We 
also extended our construction by including deformed 
static amplitude distributions. The additional freedom 
accompanying this extension might offer a possibility to 
circumvent the above mentioned puzzling problem. From 
a more general viewpoint, we have to emphasize that our 
construction only includes functional forms of the ensem¬ 
ble that depend on the trace over the product of the 
random covariance matrix and the mean covariance ma¬ 
trix. Although this is quite natural, as it guarantees a 
certain amount of basis invariance which all random ma¬ 
trix models need, more general functional forms pose an 
interesting and potentially important challenge. 

Hitherto, we only applied our method to finance. We 
plan applications to other complex systems, too. This 
may be rewarding, as large events and risk estimations 
are not only important in finance. 
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